1. R# language design
- Variable
  - 2.1. Append Vector
- Types
  - 3.1. imports .NET type
  - 3.2. unit type
- Operators in R#
  - 7.1. Logical operators
  - 7.2. Dynamics Operator Binding
    - 7.2.1. User operator
  - 7.3. pipeline operator
  - 7.4. IN operator
    - 7.4.1. combine with Which operator
    - 7.4.2. Difference of in and between
- Using tuple
  - 10.1. R object to tuple
- Linq Query
  - 11.1. Query R dataframe
  - 11.2. Join two data source
1. Code comments
## This is a code comment. Only single-line comments are allowed.
2. Variable
A variable in R# should be declared with the let or const keyword, using the <- or = operator to initialize its value from an expression. If the declaration is not followed by an initializer expression, its value defaults to NULL:
let s <- "12345";
const x <- [1, 2, 3, 4, 5];
let matrix <- [
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
];
let x;
# is equals to
let x <- NULL;
- The let keyword may appear in a closure body (function/if/loop, etc.) or at the top-level scope.
- The const keyword does almost the same thing as the let keyword, but it also marks the target symbol as read-only by automatically running lockBinding on it.
let x = 9;
print(x);
# [1] 9
x <- FALSE;
print(x);
# [1] FALSE
const y = 9;
print(y);
# [1] 9
y <- FALSE;
# error, symbol y is readonly
Declaring a vector or matrix no longer requires the c(...) function or the matrix() function. The syntax stays almost the same as in the VB.NET language, shown first here and followed by the R# equivalent:
Dim s = "12345"
Dim x = {1, 2, 3, 4, 5}
Dim m = {
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
}
let s = "12345";
let x = [1, 2, 3, 4, 5];
let m = [
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
];
By default, every primitive type in R# is a vector, and a user-defined type is a single value. So in R#, integer means an integer vector, and char means a character vector, that is, a string value. The string type in R# is a character vector.
In the traditional R language, you can use either the = or the <- operator for value assignment, and both behave the same. In the R# language these two operators differ slightly: assignment with the <- operator means ByVal, and assignment with the = operator means ByRef:
let a <- [1, 2, 3, 4, 5];
let b1, b2 as integer;
b1 <- a; # ByVal
b2 = a; # ByRef
a[6] = 99;
b1;
# ByVal clones the source values, so when the source changes later,
# the cloned variable keeps its original values:
# [1] 1 2 3 4 5
b2;
# ByRef references the memory location of the source,
# so when the source changes, the referencing variable changes too:
# [1] 1 2 3 4 5 99
# Likewise, changing the value of the ByRef variable b2 also changes the source:
b2[1] = 88;
b2;
# [1] 88 2 3 4 5 99
a;
# [1] 88 2 3 4 5 99
b1[2] = -10;
b1;
# [1] 1 -10 3 4 5
a;
# [1] 88 2 3 4 5 99
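The copy-versus-reference behaviour above can be modelled in another language. The following is only a minimal Python sketch of the same idea, not R# code: a copied list stands in for ByVal and a second name for the same list stands in for ByRef. Index positions are shifted by one because Python lists are zero-based.

```python
# Model of the two assignment modes described above, using Python lists.
a = [1, 2, 3, 4, 5]

b1 = list(a)  # like `b1 <- a` (ByVal): an independent copy of the values
b2 = a        # like `b2 = a` (ByRef): a second name for the same list

a.append(99)  # plays the role of `a[6] = 99`

b2[0] = 88    # R# index 1 is Python index 0; this also changes `a`
b1[1] = -10   # R# index 2 is Python index 1; `a` is not affected

print(b1)  # [1, -10, 3, 4, 5]
print(b2)  # [88, 2, 3, 4, 5, 99]
print(a)   # [88, 2, 3, 4, 5, 99]
```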
2.1. Append Vector
In the R language you can use the append() function to append to a vector. In R# you can use either the append() function or the left shift << operator:
v <- append(a, b)
## left shift appends b to vector a and creates a new vector
v <- a << b
NOTE: The R# language is not designed for general-purpose programming; it is mostly used for CLI and data scripting inside the GCModeller environment. For that reason, the two VB.NET bit shift operators << and >> no longer perform bit operations in the R# language:
- The << operator is usually used for array push-like operations in the R# language;
- And the >> operator is usually used for data file save operations.
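To illustrate how a shift operator can be repurposed for appending, here is a small Python sketch. The Vector class is a hypothetical helper written only for this illustration; it is not how the R# interpreter is implemented. As in the description above, the operator returns a new vector and leaves both operands unchanged.

```python
class Vector:
    """Tiny stand-in for an R#-style vector (hypothetical helper)."""

    def __init__(self, *items):
        self.items = list(items)

    def __lshift__(self, other):
        # Append `other` and return a new vector; the operands stay unchanged.
        extra = other.items if isinstance(other, Vector) else [other]
        return Vector(*self.items, *extra)


a = Vector(1, 2, 3)
b = Vector(4, 5)
v = a << b
print(v.items)  # [1, 2, 3, 4, 5]
print(a.items)  # [1, 2, 3]
```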
3. Types
The R# language has several primitive types. By default, all of them are vector types:
primitive type in R# | .NET type |
---|---|
integer | System.Int64 vector |
double | System.Double vector |
uinteger | System.UInt64 vector |
string | System.String vector |
char | System.Char vector |
boolean | System.Boolean vector |
The R language is not designed as an OOP language, and neither is the R# language. But you can still declare a user type with the list() function, for example:
# it works, but too verbose
let obj <- list();
obj$a <- 123;
obj$b <- "+++";
# this also works, but the statement is not elegant when you have
# to put a lot of property slots into the R# list object
# variable.
let obj <- list(a = 123, b = "+++");
# use with to initialize the object properties
let obj <- list(list = [TRUE, TRUE, TRUE, FALSE], flag = 0) with {
$a <- 123;
$b <- "+++";
}
The with{} closure lets you initialize properties at the same time as you create your user type with the list() function, just like what you do in VisualBasic:
Dim obj As New <userType> With {
.a = 123,
.b = "+++"
}
You can also use the with{} closure to add or modify property values dynamically in a more concise way:
let create_Foo as function() {
return list() with {
# please note that this operator requires the $name property in this user type,
# but the type does not have a name property yet.
# Don't worry: the functions below add it.
%in% <- function($, str) {
return $name in str;
}
}
}
let using_my_name as function() {
return create_Foo() with {
$name <- "123";
}
}
let using_selected_name as function(names as string) {
return create_Foo() with {
$name <- names;
}
}
# The ``with{}`` closure shown above is equivalent to this code,
# but this version is more verbose
let using_selected_name as function(names as string) {
let o <- create_Foo();
o$name <- names;
return o;
}
Generally, a parameter of an R# function is of generic type, so a function defined like this:
let test as function(x) {
# ...
}
can accept input of any type. But you can use the param as <type> syntax to constrain a parameter to a specific type (user types produced by the list() function are currently not supported by this type constraint feature):
let test.integer as function(x as integer) {
# the type constraint means the parameter only accepts an integer vector;
# if the argument is a string vector, the interpreter will throw an exception.
}
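The effect of such a type constraint can be sketched as a runtime check. The Python function below is a hypothetical illustration (the name check_integer_vector and the use of TypeError are assumptions made for this sketch); it only mimics the described behaviour of rejecting a non-integer vector.

```python
def check_integer_vector(x):
    """Accept only a list of integers, mimicking `x as integer`."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if not all(isinstance(v, int) and not isinstance(v, bool) for v in x):
        raise TypeError("x must be an integer vector")
    return x


print(check_integer_vector([1, 2, 3]))  # [1, 2, 3]

try:
    check_integer_vector(["a", "b"])
except TypeError as err:
    print("rejected:", err)
```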
3.1. imports .NET type
You can use the new keyword to create an instance of an imported .NET type in the R# language:
let a = new <name>(...);
let b = new <name>() with {
$a = ...;
$b = ...;
$c = $a + $b;
};
To use this feature, make sure that:
- the .NET object type is publicly visible;
- the target .NET object type has a parameterless constructor;
- the type is exported at the top of your package module.
3.2. unit type
The unit type feature of the R# language lets you tag numeric data with a unit:
let x as integer = [1024, 2, 3, 4, 5];
let s as integer = 30;
unit(x) <- "GB";
unit(s) <- "sec";
print(x / s);
4. Get/Set value
Getting and setting a property value works the same as in the R language:
let names <- dataframe[, "name"];
let new.names <- ["a", "b", "c"];
dataframe[, "name"] <- new.names;
5. Function and lambda function
To declare a function, use:
let add as function(a, b) {
a + b;
}
If your function is simple enough and does not modify the environment (for example, it performs no variable assignment), consider using a lambda function instead. A lambda function is more lightweight and elegant than a normal function. It allows only one expression in its body and only one parameter.
let add.lambda = [a,b] -> a+b;
6. String
R# adds string concatenation and string interpolation features, which make string manipulation easier:
let first.name <- "a";
let last.name <- "b";
let my.name <- `${first.name} ${last.name}`;
# the sprintf function is still available
let his.name <- sprintf("%s %s", first.name, last.name);
7. Operators in R#
7.1. Logical operators
The R# language uses the VisualBasic logical operator system, because the & operator conflicts with string concatenation and the | operator conflicts with the pipeline operator.
- && is replaced by and, andalso
- || is replaced by or, orelse
- ! is replaced by not
if (x <= 10 andalso y != 99) {
# ......
} else if(not z is null) {
# ......
}
7.2. Dynamics Operator Binding
You can bind operators to your custom type dynamically when you create a user object with the list() function:
# operator binding is only allowed in the with closure of the object declaration statement
var me <- list() with {
%+% <- function($, other) {
}
%is% <- function($, other) {
}
}
# and then using the operator
var new.me <- me + other;
var predicate <- me is other;
if (not me is him) {
# ......
}
Operators that allow user binding:
Operator | Description |
---|---|
+ | add |
- | subtract |
* | multiply |
/ | divide |
\ | integer divide |
% | mod |
^ | power |
is | object equals |
like | object similarity |
in | in collection set |
which | index list for true |
between | in a given range |
7.2.1. User operator
In the R language, you can define a user operator, for example:
`%NOT_IS%` <- function(x, y) x != y;
1 %NOT_IS% 1
# [1] FALSE
1 %NOT_IS% 2
# [1] TRUE
In R, such a user operator declaration works only for binary operators, not for unary ones. In the R# language you can also declare a unary operator, for example:
`%pp%` <- function(x) (x + 10)^2;
var y as integer <- 10;
var x <- %pp% y;
x;
# [1] 400
7.3. pipeline operator
An extension method call chain in VisualBasic is also known as a function pipeline:
<Extension> Function test1(x)
End Function
<Extension> Function test2(x, y)
End Function
<Extension> Function test3(a)
End Function
Dim result = "hello world!" _
.test1 _
.test2(99) _
.test3
Any R function that has at least one parameter can be used in pipeline mode, using | as the pipeline operator:
# pipeline in linux bash
ps -ef | grep R.exe
You can do the same kind of pipeline programming in R#:
# application foo prints its content to standard output on the console,
# then the replace function is called, and finally the whole
# string result is capitalized
["foo = bar"]
:> replace("foo", "bar")
:> capitalize
# BAR = BAR
let test1 as function(x) {
# ...
}
let test2 as function(x, y) {
# ...
}
let test3 as function(a) {
# ...
}
# This does exactly the same as the VisualBasic pipeline above:
let result <- ["hello world!"]
:> test1
:> test2(99)
:> test3
;
# or you can call the R functions in the normal nested way, which is much harder to read:
let result <- test3(test2(test1("hello world!"), 99));
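The equivalence between a pipeline and nested calls can be sketched with a small helper. The Python code below is an illustration only: pipe, step1, step2 and step3 are hypothetical names (the steps stand in for test1, test2 and test3), and a step with extra arguments is written as a tuple.

```python
from functools import reduce


def pipe(value, *steps):
    """Pass `value` through each step in order.

    A step is either a function or a (function, extra_args...) tuple;
    the piped value is always passed as the first argument.
    """
    def apply(acc, step):
        if callable(step):
            return step(acc)
        func, *args = step
        return func(acc, *args)

    return reduce(apply, steps, value)


# Hypothetical stand-ins for test1, test2 and test3.
def step1(x):
    return x.strip("!")


def step2(x, y):
    return f"{x} {y}"


def step3(a):
    return a.upper()


result = pipe("hello world!", step1, (step2, 99), step3)
nested = step3(step2(step1("hello world!"), 99))
print(result)            # HELLO WORLD 99
print(result == nested)  # True
```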
Note: Unlike a unix bash pipeline, where all operations can stay on the same line, the R# pipeline syntax requires each pipeline step to be on its own line:
# This is the correct pipeline syntax in R#
# Writing the pipeline across multiple lines makes your code and its comments more elegant and easier to understand
list(a=123, b= TRUE, c="123")
:> rep(10) # replicate the value from the list function 10 times
:> rbind() # rbind these replicated values as a dataframe
:> write.csv(file="./abc.csv") # save the resulting data frame as a csv file
;
# The pipeline example is equal to this code in R:
x <- list(a=123, b= TRUE, c="123");
x <- rep(x, 10);
x <- rbind(x);
x <- write.csv(x, file="./abc.csv");
# Invalid syntax example
# The R# interpreter cannot recognize this as a pipeline if all of the pipeline operations are on the same line.
# It is also not easy to add a code comment for each function call if all of the functions are
# on the same line:
list(a=123, b= TRUE, c="123")|rep(10)|rbind|write.csv(file="./abc.csv");
In VisualBasic, a function pipeline requires the user to import every namespace that contains the extension functions. In R# this is not required: you can reference the package namespace directly in your R function pipeline, for example:
# assume we have two functions with the same name, func1, in different namespaces;
# we can apply both functions in pipeline mode, like
foo_value
:> namespace.1::func1() # using the func1 in namespace.1
:> namespace.2::func1() # using the func1 in namespace.2
;
7.4. IN operator
The in operator tests whether each element exists in the target collection. It returns a boolean vector that indicates whether each element exists or not.
# in list
local booleans <- name in names(obj);
# in range
# means x >= min and x <= max
local booleans <- x between [min, max];
7.4.1. combine with Which operator
The which operator gets the indices of the TRUE values in a boolean vector:
let x <- [1, 2, 3, 4, 5];
let indices.true <- which x between [min, max];
7.4.2. Difference of in and between
The in operator applies an exact match test against the elements of a collection set, and the between operator applies a numeric range test, for example:
# in tests each element of collection x
# for an exact match in collection b
#
# each x in b?
[1, 2, 2.5, 3, 4, 5] in [2, 3]
# result:
# [1] FALSE TRUE FALSE TRUE FALSE FALSE
# between tests whether each element of collection x
# is in a given value range
[1, 2, 2.5, 3, 4, 5] between [2, 3]
# result:
# [1] FALSE TRUE TRUE TRUE FALSE FALSE
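The difference between the two tests, and how which combines with them, can be sketched element by element. The Python functions below (r_in, r_between and r_which are hypothetical names) only model the behaviour shown in the examples above; the 1-based indices returned by r_which follow the R convention, which is an assumption here.

```python
def r_in(xs, collection):
    """Exact-match test: is each element of xs a member of collection?"""
    members = set(collection)
    return [x in members for x in xs]


def r_between(xs, bounds):
    """Range test: is each element of xs inside [min, max], inclusive?"""
    lower, upper = bounds
    return [lower <= x <= upper for x in xs]


def r_which(flags):
    """1-based positions of the True values (R-style indexing assumed)."""
    return [i for i, flag in enumerate(flags, start=1) if flag]


x = [1, 2, 2.5, 3, 4, 5]
print(r_in(x, [2, 3]))                # [False, True, False, True, False, False]
print(r_between(x, [2, 3]))           # [False, True, True, True, False, False]
print(r_which(r_between(x, [2, 3])))  # [2, 3, 4]
```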
8. [] bracket in R language
Global variable:
var g <- "test";
test <- function(g as integer) {
# just like in the VisualBasic language, you can use the [] bracket
# to resolve object identifier conflicts in the R language.
# string concatenation of the parameter g with the global variable [g]
return g:ToString("F2") & [g];
}
Range generator:
[min:max,step]
if (mz in [mz.min:mz.max]) {
# range generator only allows numeric type
}
tuple variable:
# run commandline using @ operator in R
let prot.fasta = "/home/biostack/sample.fasta";
let [exitCode, std_out] <- @`makeblastdb -in "${prot.fasta}" -dbtype prot`;
List element accessor:
The R# list element accessor differs from the one in the R language. For example, in the R language:
l <- list(a = 11, b = 22, c = TRUE);
l[["a"]];
# [1] 11
l[[1]];
# [1] 11
R# tries to make this simpler, for example:
l <- list(a = 11, b = 22, c = TRUE) with {
$"1" <- |3, 4, 5, 6|;
};
# access list element by name
l["a"];
# [1] 11
l["1"];
# [4] 3 4 5 6
# access list element by index
l[&1];
# [1] 11
Please note that the term "1" is totally different from %1%: the term "1" accesses an element by property name, and the term %1% accesses an element by vector element index.
9. IO operation
## You can use the right shift operator to write data into a file
x >> [options]
x >> path
The behavior of the right shift write operator depends on the data type of object x:
- If x is a dataframe or matrix, then the write.csv/write.table function will be called.
- If x is a generic type, then the save function will be called.
- If x is a vector of a primitive type, then its data will be saved as a json/txt/csv file.
The file format depends on the file extension suffix. This feature only works for the first and third rules; under the second rule, a generic type can only be saved as an rda binary file.
For example:
let vector = [1,2,3,4,5,6];
vector >> "./index.json";
vector >> "./index.csv";
vector >> "./index.txt";
let matrix =
[|1,2,3|,
|4,5,6|,
|8,8,8|];
colnames(matrix) <- |"A","B","C"|;
matrix >> "./matrix.csv"; # A csv file will be generated.
matrix >> "./matrix.txt"; # A tsv file will be generated.
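Choosing the output format from the file extension, as described for primitive vectors above, can be sketched as a simple dispatch on the suffix. This Python code is an illustration only: render_vector is a hypothetical helper, and the exact text layout R# writes for each format is not specified in this document, so the layouts below are assumptions.

```python
import json
from pathlib import Path


def render_vector(values, path):
    """Return the text to write for `values`, chosen by the suffix of `path`."""
    suffix = Path(path).suffix.lower()
    if suffix == ".json":
        return json.dumps(list(values))           # assumed layout: JSON array
    if suffix == ".csv":
        return ",".join(str(v) for v in values)   # assumed layout: one row
    if suffix == ".txt":
        return "\n".join(str(v) for v in values)  # assumed layout: one per line
    raise ValueError(f"unsupported file extension: {suffix!r}")


vector = [1, 2, 3, 4, 5, 6]
for target in ("./index.json", "./index.csv", "./index.txt"):
    print(target, "->", repr(render_vector(vector, target)))
```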
9.1. Simple external calls
The R# language makes it easier to call external commands from the CLI: applying the @ operator to a string vector makes an external system call:
let [exitCode, stdout] <- @'/bin/GCModeller/localblast /blastp /query "{query.fasta}" /subject "{COG_myva}" /out "{COG_myva.csv}"';
# or, to make it clearer to read
let CLI <- '/bin/GCModeller/localblast /blastp /query "{query.fasta}" /subject "{COG_myva}" /out "{COG_myva.csv}"';
let [exitCode, stdout] <- @CLI;
10. Using tuple
A tuple enables an R function to return multiple values at once:
# this R function returns multiple values by using a tuple:
let tuple.test as function(a as integer, b as integer) {
[a, b, a ^ b];
}
let tuple.test2 as function(a, b) {
# a list also works fine
list(a, b, c = a^ b);
}
# and you can use the tuple members as normal variables
let [a, b, c] <- tuple.test(3, 2);
[a, b, c] = tuple.test2(100, 1);
if (a == 3) {
c <- c + a + b;
# or using pipeline
c <- [a, b, c]
:> sum
;
}
Dim tuple_test = Function(a As Integer, b As Integer)
Return (a, b, a ^ b)
End Function
Dim x As (a, b, c) = tuple_test(3, 2)
If x.a = 3 Then
' using pipeline
Dim c = {x.a, x.b, x.c}.Sum
End If
10.1. R object to tuple
You can naturally convert an object to a tuple value. The names of the tuple members should match the names in the object, so you can do something like this in R#:
let obj <- list() with {
$a <- 333;
$b <- 999;
}
# the tuple member names should match the property names in your custom object type;
# the order of the members in the tuple declaration does not matter:
let [a, b] <- obj;
But what if a property name in the object is not a valid identifier in R#? Don't worry, you can use an alias:
let obj <- list() with {
$"112233+5" <- 999;
$x <- 1;
}
let [a as "112233+5", b as "x"] <- obj;
The tuple feature is especially useful when operating on a dataframe:
let d <- data.frame(
a = [ 1, 2, 3],
b = [ "a", "g", "y"],
t = [TRUE, TRUE, FALSE]);
# in a for loop, each tuple member holds the cell value of the current dataframe row
for([a, b, c as "t"] in d) {
println("%s = %s ? (%s)", a, b, c);
}
# 1 = a ? (TRUE)
# 2 = g ? (TRUE)
# 3 = y ? (FALSE)
# if the dataframe is converted to a tuple directly,
# then each tuple member holds a whole column of the dataframe
let [a, b, booleans as "t"] <- d;
a;
# [3] 1 2 3
b;
# [3] "a" "g" "y"
booleans;
# [3] TRUE TRUE FALSE
If the tuple is applied in a for loop, each row of the dataframe is converted to a tuple. If the tuple is applied in a variable declaration, the columns of the dataframe are converted to the tuple, so each variable in the tuple is a vector whose length is the number of rows in the dataframe.
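The row-wise and column-wise readings of a tuple can be contrasted in a short sketch. The Python code below models the dataframe as a dictionary of columns, which is an assumption made only for illustration, and reuses the values of the example dataframe d.

```python
# The dataframe from the example, modelled as a dict of columns.
d = {
    "a": [1, 2, 3],
    "b": ["a", "g", "y"],
    "t": [True, True, False],
}

# Row-wise: like `for([a, b, c as "t"] in d)`, one tuple per row.
rows = list(zip(d["a"], d["b"], d["t"]))
for a, b, c in rows:
    print(f"{a} = {b} ? ({c})")

# Column-wise: like `let [a, b, booleans as "t"] <- d`, one vector per column.
a, b, booleans = d["a"], d["b"], d["t"]
print(a)         # [1, 2, 3]
print(b)         # ['a', 'g', 'y']
print(booleans)  # [True, True, False]
```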
11. Linq Query
The R# language natively supports a Linq query syntax like that of the VisualBasic.NET language:
let result = FROM x as double # query object
IN [a,b,c,d,e] # data source
WHERE predicate(x) # subset
ORDER BY x DESCENDING # option pipeline
SKIP 100 # option pipeline
TAKE 20 # option pipeline
;
A Linq query in R# generates its output according to the following rules, depending on the data source:
from data source | produce | equals to | note |
---|---|---|---|
list | list | lapply | always produces a new list |
vector | vector/list | sapply/lapply | depends on the projection result: a single value produces a vector, and compound data produces a list |
dataframe | dataframe | dataframe subset/projection syntax | the linq query works like SQL and always produces a new dataframe table |
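The clause order of the FROM ... WHERE ... ORDER BY ... SKIP ... TAKE query can be read as a sequence of plain collection operations. The following Python sketch shows that reading for a vector source; the query function and the sample data are hypothetical and only illustrate the order in which the clauses apply.

```python
def query(source, predicate, skip, take):
    """Filter, sort descending, then skip and take, in that order."""
    kept = [x for x in source if predicate(x)]  # WHERE predicate(x)
    ordered = sorted(kept, reverse=True)        # ORDER BY x DESCENDING
    return ordered[skip:skip + take]            # SKIP n, then TAKE m


data = [float(v) for v in range(10)]  # hypothetical data source
result = query(data, lambda x: x % 2 == 0, skip=1, take=2)
print(result)  # [6.0, 4.0]
```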
11.1. Query R dataframe
A linq query over a dataframe object in R# runs by row, and the column names used in the query should be selected in the FROM closure, for example:
# query dataframe
const demo = data.frame(x = x(), y = y(), z = z());
# make a table subset by rows
let subset = FROM [x, y, z]
IN demo
WHERE predicate(x, y, z)
ORDER BY z + x DESCENDING
;
# will produce a subset output
# x, y, z
# make a table projection and a subset by rows
let proj = FROM [x, y, z]
IN demo
WHERE predicate(x, y, z)
SELECT x, z # project the x and z fields to produce a new dataframe subset
;
# will produce a table projection output
# x, z
11.2. Join two data source
Like SQL in MySQL, linq in the R# language can also join two data sources:
# make a query after joining two tables
const employee = data.frame(
ID = [1,2,3,4,5,6],
Name = ["Preety", "Priyanka", "Anurag", "Pranaya", "Hina", "Sambit"],
AddressId = [1, 2, 0,0,5,6]
);
const address = data.frame(
ID = [1, 2, 5 , 6],
AddressLine = ["AddressLine1","AddressLine2","AddressLine5","AddressLine6"]
);
# the join operation
let result = FROM [ID, Name, AddressId] IN employee
JOIN [ID, AddressLine] IN address
ON employee.AddressId == address.ID
WHERE employee.ID > 5
;
# will produce a join table output
# employee.ID, Name, AddressId, address.ID, AddressLine
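For readers who want to trace what the join produces, here is a Python sketch of the same data and conditions. It assumes inner-join semantics (employees without a matching address are dropped), which this document does not state explicitly; the list-of-dicts representation is also only an illustration.

```python
employee = [
    {"ID": 1, "Name": "Preety", "AddressId": 1},
    {"ID": 2, "Name": "Priyanka", "AddressId": 2},
    {"ID": 3, "Name": "Anurag", "AddressId": 0},
    {"ID": 4, "Name": "Pranaya", "AddressId": 0},
    {"ID": 5, "Name": "Hina", "AddressId": 5},
    {"ID": 6, "Name": "Sambit", "AddressId": 6},
]
address = [
    {"ID": 1, "AddressLine": "AddressLine1"},
    {"ID": 2, "AddressLine": "AddressLine2"},
    {"ID": 5, "AddressLine": "AddressLine5"},
    {"ID": 6, "AddressLine": "AddressLine6"},
]

# Index the right-hand table by its join key.
address_by_id = {row["ID"]: row for row in address}

result = []
for e in employee:
    a = address_by_id.get(e["AddressId"])  # ON employee.AddressId == address.ID
    if a is None:
        continue                           # inner join assumed: no match, no row
    if e["ID"] > 5:                        # WHERE employee.ID > 5
        result.append({
            "employee.ID": e["ID"],
            "Name": e["Name"],
            "AddressId": e["AddressId"],
            "address.ID": a["ID"],
            "AddressLine": a["AddressLine"],
        })

print(result)
```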